The library, uWaveGestureLibrary, consists of over 4,000 instances, each of which contains accelerometer readings in three dimensions (x, y and z). The data covers the gesture patterns of eight users. Each row is one instance: the first column is the class of the gesture, labeled from one to eight, and the remaining columns are the time-ordered observations along the corresponding axis. There are three files, one each for the X, Y and Z axes.
In this assignment, we apply principal component analysis (PCA) to describe the 3D information in 1D. We also try to reduce the time dimension from 315 observations to 2, so that far fewer columns are needed to describe a user's gesture. Throughout the assignment, we use the numpy, pandas, datatable, matplotlib.pyplot, plotly.graph_objects, random, sklearn.decomposition.PCA, sklearn.manifold.MDS and sklearn.metrics.pairwise.manhattan_distances packages.
## Importing Packages
import numpy as np
import pandas as pd
from datatable import dt, by
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import random
from sklearn.decomposition import PCA
from sklearn.manifold import MDS
from sklearn.metrics.pairwise import manhattan_distances
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
The following functions are created to accomplish the objective. The first, prepare_data, imports the data and renames the columns. The second, transform_long, reshapes the wide data into long format, i.e. melts the data. The third, cal_pos, derives position information from the accelerometer readings by taking the cumulative sum of the cumulative sum of the acceleration. The fourth, pca_plot, plots the first component of the PCA result for two randomly picked instances of a class. The fifth, pca_class, applies PCA to each class separately.
def prepare_data(path, ax):
    # Read whitespace-separated data and label the columns as <axis>T<index>.
    df = pd.read_csv(path, sep=r'\s+', header=None)
    columns = [ax + 'T' + str(i) for i in range(1, df.shape[1])]
    columns.insert(0, 'class')
    df.columns = columns
    df['class'] = df['class'].astype(int)
    return df
def transform_long(df, ax):
    df2 = df.copy()
    df2['time series'] = range(1, df2.shape[0] + 1)
    data_long = pd.melt(df2,
                        id_vars=['class', 'time series'],
                        value_vars=list(df2.columns[1:df2.shape[1] - 1]),
                        var_name='time index',
                        value_name=ax)
    # Column names look like 'XT12'; drop the two-character prefix to get the integer index.
    data_long['time index'] = data_long['time index'].apply(lambda x: int(x[2:]))
    data_long.sort_values(by=['time series', 'time index'], ignore_index=True, inplace=True)
    return data_long
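The wide-to-long reshape that transform_long performs can be seen on a tiny made-up frame (two instances, three time steps, column names following the XT1, XT2, ... convention above):

```python
import pandas as pd

# Toy wide frame: a class column plus three time-ordered X readings per instance.
wide = pd.DataFrame({
    'class': [1, 2],
    'XT1': [0.1, 0.4],
    'XT2': [0.2, 0.5],
    'XT3': [0.3, 0.6],
})
wide['time series'] = range(1, len(wide) + 1)

long = pd.melt(wide,
               id_vars=['class', 'time series'],
               value_vars=['XT1', 'XT2', 'XT3'],
               var_name='time index',
               value_name='X')
# Strip the 'XT' prefix so the time index becomes an integer.
long['time index'] = long['time index'].str[2:].astype(int)
long = long.sort_values(['time series', 'time index'], ignore_index=True)
print(long.shape)  # 2 instances x 3 time steps = 6 rows, 4 columns
```

After sorting, each instance's readings appear as consecutive rows ordered by time index, which is the layout the later merges rely on.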
def cal_pos(df, ax):
    DT = dt.Frame(df)
    # Pick the 16th instance of each class, then drop the class column.
    x = DT[15, :, by("class")][:, 1:]
    x = x.to_pandas()
    y = x.transpose()
    # The cumulative sum of acceleration approximates velocity;
    # a second cumulative sum approximates position.
    cum_sum_y = y.apply(lambda s: s.cumsum())
    cum_sum_y_2 = cum_sum_y.apply(lambda s: s.cumsum())
    columns1 = [ax + '_' + str(i) for i in range(1, y.shape[1] + 1)]
    columns2 = ["vel_" + str(i) for i in range(1, cum_sum_y.shape[1] + 1)]
    columns3 = ["pos_" + str(i) for i in range(1, cum_sum_y_2.shape[1] + 1)]
    columns = columns1 + columns2 + columns3
    y = pd.concat([y, cum_sum_y, cum_sum_y_2], axis=1)
    y.columns = columns
    return y
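The double cumulative sum in cal_pos is a discrete double integration: the first cumulative sum of acceleration approximates velocity, the second approximates position (treating the sampling interval as 1). A minimal numpy illustration:

```python
import numpy as np

# Constant acceleration of 1 unit per step, starting at rest.
acc = np.ones(5)
vel = np.cumsum(acc)  # approximate velocity: 1, 2, 3, 4, 5
pos = np.cumsum(vel)  # approximate position: 1, 3, 6, 10, 15
print(vel)
print(pos)
```

The position values grow quadratically, as expected for constant acceleration.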
def pca_plot(value):
    series = x_train_long.loc[x_train_long['class'] == value, 'time series'].unique()
    random.seed(12345)
    # random.sample no longer accepts sets in recent Python versions.
    randoms = random.sample(list(series), 2)
    plot_data = data_long[data_long['time series'].isin(randoms)]
    _ = plt.scatter(plot_data['time index'], plot_data['PCA1'], c=plot_data['time series'])
    _ = plt.title("1D Representation of Data for Class " + str(value))
    _ = plt.xlabel("Time Index")
    _ = plt.ylabel("First Component")
def pca_class(value):
    df = data_long[data_long['class'] == value]
    pca = PCA(n_components=1)
    # Columns 2:5 are the X, Y and Z readings; slicing to 2:6 would
    # wrongly include the class column in the PCA fit.
    pca_result = pca.fit_transform(df.iloc[:, 2:5])
    return pca.explained_variance_ratio_
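As a sanity check of the 3D-to-1D idea before running it on the gesture data, here is PCA on toy three-axis readings that all share one latent signal (the variable names are made up for the illustration); with nearly collinear axes, a single component captures almost all of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
t = rng.normal(size=200)
# Three correlated 'axes': the same latent signal plus small independent noise.
xyz = np.column_stack([t + 0.05 * rng.normal(size=200) for _ in range(3)])

pca = PCA(n_components=1)
one_d = pca.fit_transform(xyz)
print(one_d.shape)                    # (200, 1)
print(pca.explained_variance_ratio_)  # close to 1 for near-collinear axes
```

On real accelerometer data the axes are far less correlated, which is why the explained-variance ratios reported below are much lower.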
With the following steps, we import the data and create wide and long formats of it.
x_train = prepare_data("https://drive.google.com/uc?export=download&id=1KDhDT0FE5lkjvn62YTCJ87vZ7A5uS5TT", "X")
y_train = prepare_data("https://drive.google.com/uc?export=download&id=1fZCNBdJ40Df5werSu_Ud4GUmCBcBIfaI", "Y")
z_train = prepare_data("https://drive.google.com/uc?export=download&id=1jdZ2_NiFil0b4EbLBAfDJ43VQcOgulpf", "Z")
data_wide = pd.concat([x_train, y_train.iloc[:, 1:y_train.shape[1]], z_train.iloc[:, 1:z_train.shape[1]]], axis = 1)
x_train_long = transform_long(x_train, 'X')
y_train_long = transform_long(y_train, 'Y')
z_train_long = transform_long(z_train, 'Z')
x_y_merge = pd.merge(x_train_long, y_train_long, on = ["time series", "time index", "class"], how = 'left')
data_long = pd.merge(x_y_merge, z_train_long, on = ["time series", "time index", "class"], how = 'left')
cols = data_long.columns.tolist()
cols = cols[1:] + cols[:1]
data_long = data_long[cols]
These are the heads of all the data frames we created.
print(x_train.head())
print(y_train.head())
print(z_train.head())
print(data_wide.head())
print(x_train_long.head())
print(y_train_long.head())
print(z_train_long.head())
print(data_long.head())
We can plot the data in 3D. To do so, we use the plotly package and the derived position information.
x_pos = cal_pos(x_train, 'X')
y_pos = cal_pos(y_train, 'Y')
z_pos = cal_pos(z_train, 'Z')
for i in range(1, 9):
    cols = "pos_" + str(i)
    fig = go.Figure(data=[go.Scatter3d(
        x=x_pos.loc[:, cols],
        y=y_pos.loc[:, cols],
        z=z_pos.loc[:, cols],
        mode='markers',
        marker=dict(
            size=12,
            color='blue',
            colorscale='Viridis',
            opacity=0.8
        )
    )])
    fig.update_layout(margin=dict(l=0, r=0, b=0, t=0),
                      title=dict(text="3D Representation of Accelerometer for Class " + str(i)))
    fig.show()
Although they are not identical, these plots are the graphical (3D) representation of the gestures shown below.

We can apply PCA to reduce the 3D information to 1D. Before that, we need to check the scale of the data; if the axes are not on the same scale, we need to standardize them first.
In this step, we use the accelerometer readings. Another possibility would be to use the position information of all instances.
print(x_train.iloc[:, 1:].apply(np.mean, axis = 1).head())
print(x_train.iloc[:, 1:].apply(np.std, axis = 1).head())
print(y_train.iloc[:, 1:].apply(np.mean, axis = 1).head())
print(y_train.iloc[:, 1:].apply(np.std, axis = 1).head())
print(z_train.iloc[:, 1:].apply(np.mean, axis = 1).head())
print(z_train.iloc[:, 1:].apply(np.std, axis = 1).head())
When we look at the means and standard deviations, all axes have approximately zero mean and unit standard deviation, which means the data is ready for PCA.
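If the axes had not already been on a common scale, they could be standardized before PCA, for instance with scikit-learn's StandardScaler. A sketch on made-up numbers (not needed for this data, shown only for completeness):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales.
raw = np.array([[1.0, 100.0],
                [2.0, 200.0],
                [3.0, 300.0]])
scaled = StandardScaler().fit_transform(raw)
print(scaled.mean(axis=0))  # approximately 0 per column
print(scaled.std(axis=0))   # 1 per column
```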
pca = PCA(n_components = 1)
pca_result = pca.fit_transform(data_long.iloc[:, 2:5])
print(pca.explained_variance_ratio_)
data_long["PCA1"] = pca_result
Looking at the summary output, we see that one principal component explains almost 50% of the variance. We can plot the component against the time index. To judge how much information survives, we put it next to the 3D scatter plot of the accelerometer-derived positions; if they convey the same pattern, reducing the data from 3D to 1D is a reasonable method.
pca_plot(1)
fig = go.Figure(data=[go.Scatter3d(
x=x_pos.loc[:,"X_1"],
y=y_pos.loc[:,"Y_1"],
z=z_pos.loc[:,"Z_1"],
mode='markers',
marker=dict(
size=12,
color='blue',
colorscale='Viridis',
opacity=0.8
)
)])
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0),
title=dict(text = "3D Representation of Accelerometer for Class 1"))
fig.show()
From the plot above, we cannot say that the 1D and 3D views look the same. However, when we compare the two randomly picked instances with each other, we can see that they carry similar information.
pca_plot(2)
fig = go.Figure(data=[go.Scatter3d(
x=x_pos.loc[:,"X_2"],
y=y_pos.loc[:,"Y_2"],
z=z_pos.loc[:,"Z_2"],
mode='markers',
marker=dict(
size=12,
color='blue',
colorscale='Viridis',
opacity=0.8
)
)])
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0),
title=dict(text = "3D Representation of Accelerometer for Class 2"))
fig.show()
From the plot above, we cannot say that the 1D and 3D views look the same. However, when we compare the two randomly picked instances with each other, we can see that they carry similar information.
pca_plot(3)
fig = go.Figure(data=[go.Scatter3d(
x=x_pos.loc[:,"X_3"],
y=y_pos.loc[:,"Y_3"],
z=z_pos.loc[:,"Z_3"],
mode='markers',
marker=dict(
size=12,
color='blue',
colorscale='Viridis',
opacity=0.8
)
)])
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0),
title=dict(text = "3D Representation of Accelerometer for Class 3"))
fig.show()
From the plot above, we cannot say that the 1D and 3D views look the same. However, when we compare the two randomly picked instances with each other, we can see that they carry similar information.
pca_plot(4)
fig = go.Figure(data=[go.Scatter3d(
x=x_pos.loc[:,"X_4"],
y=y_pos.loc[:,"Y_4"],
z=z_pos.loc[:,"Z_4"],
mode='markers',
marker=dict(
size=12,
color='blue',
colorscale='Viridis',
opacity=0.8
)
)])
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0),
title=dict(text = "3D Representation of Accelerometer for Class 4"))
fig.show()
From the plot above, we cannot say that the 1D and 3D views look the same. However, when we compare the two randomly picked instances with each other, we can see that they carry similar information.
pca_plot(5)
fig = go.Figure(data=[go.Scatter3d(
x=x_pos.loc[:,"X_5"],
y=y_pos.loc[:,"Y_5"],
z=z_pos.loc[:,"Z_5"],
mode='markers',
marker=dict(
size=12,
color='blue',
colorscale='Viridis',
opacity=0.8
)
)])
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0),
title=dict(text = "3D Representation of Accelerometer for Class 5"))
fig.show()
From the plot above, we cannot say that the 1D and 3D views look the same. However, when we compare the two randomly picked instances with each other, we can see that they carry similar information.
pca_plot(6)
fig = go.Figure(data=[go.Scatter3d(
x=x_pos.loc[:,"X_6"],
y=y_pos.loc[:,"Y_6"],
z=z_pos.loc[:,"Z_6"],
mode='markers',
marker=dict(
size=12,
color='blue',
colorscale='Viridis',
opacity=0.8
)
)])
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0),
title=dict(text = "3D Representation of Accelerometer for Class 6"))
fig.show()
From the plot above, we cannot say that the 1D and 3D views look the same. However, when we compare the two randomly picked instances with each other, we can see that they carry similar information.
pca_plot(7)
fig = go.Figure(data=[go.Scatter3d(
x=x_pos.loc[:,"X_7"],
y=y_pos.loc[:,"Y_7"],
z=z_pos.loc[:,"Z_7"],
mode='markers',
marker=dict(
size=12,
color='blue',
colorscale='Viridis',
opacity=0.8
)
)])
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0),
title=dict(text = "3D Representation of Accelerometer for Class 7"))
fig.show()
From the plot above, we cannot say that the 1D and 3D views look the same. However, when we compare the two randomly picked instances with each other, we can see that they carry similar information.
pca_plot(8)
fig = go.Figure(data=[go.Scatter3d(
x=x_pos.loc[:,"X_8"],
y=y_pos.loc[:,"Y_8"],
z=z_pos.loc[:,"Z_8"],
mode='markers',
marker=dict(
size=12,
color='blue',
colorscale='Viridis',
opacity=0.8
)
)])
fig.update_layout(margin=dict(l=0, r=0, b=0, t=0),
title=dict(text = "3D Representation of Accelerometer for Class 8"))
fig.show()
From the plot above, we cannot say that the 1D and 3D views look the same. However, when we compare the two randomly picked instances with each other, we can see that they carry similar information.
This is expected: as the PCA summary shows, one component explains only about half of the variance, so some information is lost in the process.
Another approach is to apply PCA to every class individually. We can expect better results this way, because it removes the between-class variance from each fit.
exp_var_rat1 = pca_class(1)
print(exp_var_rat1)
When we look at the summary, we see that one component explains 46% of the variance of class 1, which is worse than the general PCA result.
exp_var_rat2 = pca_class(2)
print(exp_var_rat2)
When we look at the summary, we see that one component explains 51% of the variance of class 2, which is better than the general PCA result.
exp_var_rat3 = pca_class(3)
print(exp_var_rat3)
When we look at the summary, we see that one component explains 54% of the variance of class 3, which is better than the general PCA result.
exp_var_rat4 = pca_class(4)
print(exp_var_rat4)
When we look at the summary, we see that one component explains 55% of the variance of class 4, which is better than the general PCA result.
exp_var_rat5 = pca_class(5)
print(exp_var_rat5)
When we look at the summary, we see that one component explains 65% of the variance of class 5, which is better than the general PCA result.
exp_var_rat6 = pca_class(6)
print(exp_var_rat6)
When we look at the summary, we see that one component explains 57% of the variance of class 6, which is better than the general PCA result.
exp_var_rat7 = pca_class(7)
print(exp_var_rat7)
When we look at the summary, we see that one component explains 52% of the variance of class 7, which is better than the general PCA result.
exp_var_rat8 = pca_class(8)
print(exp_var_rat8)
When we look at the summary, we see that one component explains 61% of the variance of class 8, which is better than the general PCA result.
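The eight per-class runs above could also be written as a single loop. A self-contained sketch of the same pattern, using a small stand-in frame in place of data_long so it runs on its own:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Stand-in for data_long: toy X/Y/Z readings for two classes.
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    'class': np.repeat([1, 2], 100),
    'X': rng.normal(size=200),
    'Y': rng.normal(size=200),
    'Z': rng.normal(size=200),
})

# Fit one 1-component PCA per class and collect the explained-variance ratios.
ratios = {}
for value in sorted(toy['class'].unique()):
    sub = toy.loc[toy['class'] == value, ['X', 'Y', 'Z']]
    pca = PCA(n_components=1)
    pca.fit(sub)
    ratios[value] = pca.explained_variance_ratio_[0]
print(ratios)
```

With the real data, iterating over range(1, 9) in the same way would reproduce the eight ratios reported above.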
In general, applying PCA per class gives better results than applying it to all classes at once. This approach is practical for this data because there are only eight classes; with many more classes, it would become time-consuming.
Looking at the data, there are 315 time-ordered observations per axis. That is a large number of features for a model, so reducing it is worthwhile. To do so, we can use multidimensional scaling (MDS), which first requires a distance matrix of the data; we compute it with the manhattan_distances function. We use the Manhattan metric because we need to sum the distances across the three axes, and this per-axis summation is exactly how the Manhattan distance combines coordinates, so summing the per-axis matrices stays consistent with the metric. With the Euclidean metric, the per-axis distance matrices could not simply be added.
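The consistency argument for the Manhattan metric can be verified directly: the sum of the three per-axis Manhattan distance matrices equals the Manhattan distance computed on the concatenated X/Y/Z features, because the L1 distance is a sum over coordinates. A quick check on random toy data:

```python
import numpy as np
from sklearn.metrics.pairwise import manhattan_distances

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5))  # 4 instances, 5 time steps per axis
y = rng.normal(size=(4, 5))
z = rng.normal(size=(4, 5))

per_axis_sum = manhattan_distances(x) + manhattan_distances(y) + manhattan_distances(z)
concatenated = manhattan_distances(np.hstack([x, y, z]))
print(np.allclose(per_axis_sum, concatenated))  # True: the L1 distance is additive
```

The same identity does not hold for the Euclidean metric, since square roots do not distribute over the per-axis sums.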
x_train_distance = manhattan_distances(x_train.iloc[:, 1:])
y_train_distance = manhattan_distances(y_train.iloc[:, 1:])
z_train_distance = manhattan_distances(z_train.iloc[:, 1:])
data_distance = x_train_distance + y_train_distance + z_train_distance
data_distance.shape
Now we have a 896 x 896 distance matrix. After preparing the data, we can apply the MDS process and plot the result colored by class.
# The input is a precomputed distance matrix, so MDS must be told not to
# recompute Euclidean distances from it.
embedding = MDS(n_components=2, dissimilarity='precomputed')
mds = embedding.fit_transform(data_distance)
mds = pd.DataFrame({"class": x_train.iloc[:,0].values, "D1": mds[:,0], "D2": mds[:,1]})
_ = plt.scatter(mds['D1'], mds['D2'], c = mds['class'])
_ = plt.title("Result of the MDS Process")
_ = plt.xlabel("First Dimension")
_ = plt.ylabel("Second Dimension")
Looking at the plot, some interesting patterns emerge. Recall that classes 3 and 4 are opposite gestures. In the plot, instances of class 3 have positive values in the first dimension and negative values in the second, while class 4 has negative values in the first dimension and positive values in the second; they remain opposed in the reduced form. We can see a similar opposition between classes 5 & 6 and 7 & 8. Also, the members of each class land in the same region. As a result, the MDS process is successful.
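The visual verdict can also be backed by a number: scikit-learn's MDS exposes the final stress (the sum of squared discrepancies between the input distances and the embedded distances) through the stress_ attribute, so a lower value indicates a more faithful embedding. A self-contained sketch on toy data:

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.metrics.pairwise import manhattan_distances

rng = np.random.default_rng(0)
points = rng.normal(size=(30, 10))
dist = manhattan_distances(points)

# Embed the precomputed distance matrix into 2D and read off the stress.
embedding = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
coords = embedding.fit_transform(dist)
print(coords.shape)        # (30, 2)
print(embedding.stress_)   # non-negative; lower means a better fit
```

Reporting the stress of the 896-instance embedding alongside the scatter plot would make the "successful" claim quantitative.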